This dataset is looking at the species richness of orchid flora and their response to 30 different environmental factors in 3500 grid cells in the Beipan River Basin in Guizhou Province. For this exercise I selected species richness as the response variable and mean annual temperature, mean annual precipitation, elevation, human population density, and area of cropland as predictor variables.
We have a high correlation between Mean Annual Temperature and Mean Annual Precipitation, as expected. There is relatively low correlation between the other variables.
model_1 <- lm(richness ~ MAT + MAP + elevation + population + cropland, data=orch)
anova (model_1) #coefficients of the full model
## Analysis of Variance Table
##
## Response: richness
## Df Sum Sq Mean Sq F value Pr(>F)
## MAT 1 37910 37910 172.528 < 2.2e-16 ***
## MAP 1 15221 15221 69.272 < 2.2e-16 ***
## elevation 1 83989 83989 382.232 < 2.2e-16 ***
## population 1 7816 7816 35.570 2.706e-09 ***
## cropland 1 47670 47670 216.945 < 2.2e-16 ***
## Residuals 3493 767528 220
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## # Check for Multicollinearity
##
## Low Correlation
##
## Term VIF VIF 95% CI Increased SE Tolerance Tolerance 95% CI
## MAT 1.75 [1.68, 1.84] 1.32 0.57 [0.54, 0.60]
## MAP 1.76 [1.68, 1.85] 1.33 0.57 [0.54, 0.59]
## elevation 1.19 [1.15, 1.24] 1.09 0.84 [0.81, 0.87]
## population 1.07 [1.04, 1.12] 1.04 0.93 [0.89, 0.96]
## cropland 1.10 [1.07, 1.15] 1.05 0.91 [0.87, 0.94]
Since all variables have a Variance Inflation Factor less than 5, we can conclude that there is a low correlation between the predictor variables.
## Global model call: lm(formula = richness ~ MAT + MAP + elevation + population +
## cropland, data = orch)
## ---
## Model selection table
## (Intrc) crpln elvtn MAP MAT ppltn df logLik AICc delta
## 32 -55.2900 -2.239e-06 0.02160 0.07295 -0.48660 -0.0002350 7 -14395.89 28805.8 0.00
## 24 -51.3200 -2.183e-06 0.02068 0.06371 -0.0002335 6 -14399.45 28810.9 5.10
## 16 -58.3700 -2.320e-06 0.02247 0.07476 -0.47980 6 -14403.56 28819.1 13.33
## 8 -54.4400 -2.264e-06 0.02156 0.06563 5 -14407.00 28824.0 18.20
## 28 13.3600 -2.282e-06 0.01587 1.01500 -0.0003007 6 -14480.54 28973.1 167.29
## 12 11.5800 -2.387e-06 0.01682 1.07200 5 -14492.57 28995.2 189.34
## 23 -68.2500 0.02389 0.07252 -0.0003535 5 -14501.49 29013.0 207.18
## 31 -69.2700 0.02412 0.07469 -0.11170 -0.0003546 6 -14501.31 29014.6 208.83
## 20 29.5600 -2.494e-06 0.01653 -0.0003345 5 -14504.63 29019.3 213.46
## 7 -74.0300 0.02543 0.07599 4 -14518.10 29044.2 238.39
## 15 -74.7700 0.02560 0.07757 -0.08047 5 -14518.00 29046.0 240.21
## 4 28.5800 -2.625e-06 0.01762 4 -14519.39 29046.8 240.97
## 30 -20.5700 -2.550e-06 0.04409 0.26400 -0.0003952 6 -14520.75 29053.5 247.72
## 22 -22.0200 -2.590e-06 0.04874 -0.0004001 5 -14521.80 29053.6 247.80
## 14 -23.4700 -2.711e-06 0.04519 0.32840 5 -14541.44 29092.9 287.09
## 6 -25.3200 -2.764e-06 0.05101 4 -14543.04 29094.1 288.28
## 26 19.5600 -2.523e-06 1.14400 -0.0004109 5 -14553.40 29116.8 311.00
## 10 17.5900 -2.690e-06 1.23400 4 -14575.36 29158.7 352.91
## 18 38.2000 -2.775e-06 -0.0004543 4 -14582.82 29173.6 367.83
## 27 0.7777 0.01830 1.43400 -0.0004242 5 -14584.92 29179.9 374.04
## 11 -2.6150 0.01982 1.54300 4 -14607.83 29223.7 417.86
## 2 37.6300 -2.983e-06 3 -14609.40 29224.8 418.98
## 19 22.8600 0.01959 -0.0004906 4 -14631.68 29271.4 465.57
## 29 -32.0600 0.04221 0.79920 -0.0005552 5 -14649.28 29308.6 502.76
## 21 -37.1700 0.05667 -0.0005783 4 -14658.44 29324.9 519.08
## 3 20.8600 0.02148 3 -14661.83 29329.7 523.84
## 25 6.4890 1.63600 -0.0005686 4 -14677.13 29362.3 556.45
## 13 -37.2900 0.04363 0.94080 4 -14688.01 29384.0 578.21
## 5 -43.6000 0.06085 3 -14700.50 29407.0 601.20
## 9 2.4620 1.81000 3 -14717.12 29440.3 634.44
## 17 32.4000 -0.0006570 3 -14735.28 29476.6 670.75
## 1 30.9100 2 -14787.60 29579.2 773.39
## weight
## 32 0.927
## 24 0.072
## 16 0.001
## 8 0.000
## 28 0.000
## 12 0.000
## 23 0.000
## 31 0.000
## 20 0.000
## 7 0.000
## 15 0.000
## 4 0.000
## 30 0.000
## 22 0.000
## 14 0.000
## 6 0.000
## 26 0.000
## 10 0.000
## 18 0.000
## 27 0.000
## 11 0.000
## 2 0.000
## 19 0.000
## 29 0.000
## 21 0.000
## 3 0.000
## 25 0.000
## 13 0.000
## 5 0.000
## 9 0.000
## 17 0.000
## 1 0.000
## Models ranked by AICc(x)
There are 32 possible models based on additive combinations of variables (no interaction terms).
## Global model call: lm(formula = richness ~ MAT + MAP + elevation + population +
## cropland, data = orch)
## ---
## Model selection table
## (Intrc) crpln elvtn MAP MAT ppltn df logLik AICc delta
## 32 -55.29 -2.239e-06 0.02160 0.07295 -0.4866 -0.0002350 7 -14395.89 28805.8 0.0
## 24 -51.32 -2.183e-06 0.02068 0.06371 -0.0002335 6 -14399.45 28810.9 5.1
## weight
## 32 0.928
## 24 0.072
## Models ranked by AICc(x)
## Global model call: lm(formula = richness ~ MAT + MAP + elevation + population +
## cropland, data = orch)
## ---
## Model selection table
## (Intrc) crpln elvtn MAP MAT ppltn df logLik AICc delta weight
## 32 -55.29 -2.239e-06 0.0216 0.07295 -0.4866 -0.000235 7 -14395.89 28805.8 0 1
## Models ranked by AICc(x)
## cropland elevation MAP population MAT
## Sum of weights: 1.00 1.00 1.00 1.00 0.93
## N containing models: 16 16 16 16 16
We can see that area of cropland, elevation, mean annual precipitation, and human population density are all equally important. Mean annual temperature is less important, but it should still be considered.
##
## Call:
## model.avg(object = dredge_1, revised.var = TRUE)
##
## Component models:
## '12345' '1235' '1234' '123' '1245' '124' '235' '2345' '125' '23'
## '234' '12' '1345' '135' '134' '13' '145' '14' '15' '245'
## '24' '1' '25' '345' '35' '2' '45' '34' '3' '4'
## '5' '(Null)'
##
## Coefficients:
## (Intercept) cropland elevation MAP MAT population
## full -55.00649 -2.235242e-06 0.02153479 0.0722874 -0.4514571 -0.0002345950
## subset -55.00649 -2.235242e-06 0.02153479 0.0722874 -0.4866271 -0.0002348973
#summary(model.avg(dredge_wash)) # if you want to average across all models, both competitive and non-competitive
summary(model.avg(dredge_1, subset = delta < 6)) # if you just want to look only at competitive models, which
##
## Call:
## model.avg(object = dredge_1, subset = delta < 6)
##
## Component model call:
## lm(formula = richness ~ <2 unique rhs>, data = orch)
##
## Component models:
## df logLik AICc delta weight
## 12345 7 -14395.89 28805.82 0.0 0.93
## 1235 6 -14399.45 28810.92 5.1 0.07
##
## Term codes:
## cropland elevation MAP MAT population
## 1 2 3 4 5
##
## Model-averaged coefficients:
## (full average)
## Estimate Std. Error Adjusted SE z value Pr(>|z|)
## (Intercept) -5.500e+01 5.822e+00 5.824e+00 9.445 < 2e-16 ***
## cropland -2.235e-06 1.526e-07 1.527e-07 14.639 < 2e-16 ***
## elevation 2.153e-02 1.361e-03 1.362e-03 15.812 < 2e-16 ***
## MAP 7.229e-02 5.966e-03 5.968e-03 12.112 < 2e-16 ***
## MAT -4.515e-01 2.163e-01 2.163e-01 2.087 0.0369 *
## population -2.349e-04 6.001e-05 6.003e-05 3.913 9.11e-05 ***
##
## (conditional average)
## Estimate Std. Error Adjusted SE z value Pr(>|z|)
## (Intercept) -5.500e+01 5.822e+00 5.824e+00 9.445 < 2e-16 ***
## cropland -2.235e-06 1.526e-07 1.527e-07 14.639 < 2e-16 ***
## elevation 2.153e-02 1.361e-03 1.362e-03 15.812 < 2e-16 ***
## MAP 7.229e-02 5.966e-03 5.968e-03 12.112 < 2e-16 ***
## MAT -4.866e-01 1.825e-01 1.826e-01 2.665 0.0077 **
## population -2.349e-04 6.001e-05 6.003e-05 3.913 9.11e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#is the point of model selection.
#there is justification for looking only at the competitive models; trying to narrow things down.
Unsure why there are negative values for species richness. When including human population density as a predictive effect, we would want to use a different error structure because the prediction line becomes negative for population densities above 50,000 people per square kilometer